model search
Poodle: Seamlessly Scaling Down Large Language Models with Just-in-Time Model Replacement
Nils Strassenburg, Boris Glavic, Tilmann Rabl
Businesses increasingly rely on large language models (LLMs) to automate simple repetitive tasks instead of developing custom machine learning models. LLMs require few, if any, training examples and can be utilized by users without expertise in model development. However, this comes at the cost of substantially higher resource and energy consumption compared to smaller models, which often achieve similar predictive performance for simple tasks. In this paper, we present our vision for just-in-time model replacement (JITR), where, upon identifying a recurring task in calls to an LLM, the model is transparently replaced with a cheaper alternative that performs well for this specific task. JITR retains the ease of use and low development effort of LLMs while saving significant cost and energy. We discuss the main challenges in realizing our vision regarding the identification of recurring tasks and the creation of a custom model. Specifically, we argue that model search and transfer learning will play a crucial role in JITR to efficiently identify and fine-tune models for a recurring task. Using our JITR prototype Poodle, we demonstrate significant savings on example tasks.
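The core JITR loop described above (identify a recurring task, then swap in a cheap specialist) can be sketched as a routing wrapper. This is a minimal illustration, not Poodle's actual design: the class name, the signature heuristic, and the `train_specialist` factory are all hypothetical stand-ins.

```python
from collections import Counter

class JITRouter:
    """Illustrative just-in-time model replacement (JITR) router.

    Sends prompts to the expensive LLM until a task signature has
    recurred often enough, then trains and routes to a cheap specialist.
    All names here are hypothetical; the real system is more involved.
    """

    def __init__(self, llm, train_specialist, threshold=100):
        self.llm = llm                            # callable: prompt -> answer
        self.train_specialist = train_specialist  # factory: examples -> callable
        self.threshold = threshold
        self.seen = Counter()                     # task signature -> call count
        self.examples = {}                        # signature -> (prompt, answer) pairs
        self.specialists = {}                     # signature -> cheap model

    def signature(self, prompt):
        # Crude stand-in for task identification: bucket calls by the
        # instruction part of the prompt (the text before the first colon).
        return prompt.split(":", 1)[0].strip().lower()

    def __call__(self, prompt):
        sig = self.signature(prompt)
        if sig in self.specialists:
            return self.specialists[sig](prompt)  # cheap path
        answer = self.llm(prompt)                 # expensive path
        self.seen[sig] += 1
        self.examples.setdefault(sig, []).append((prompt, answer))
        if self.seen[sig] >= self.threshold:
            # Recurring task detected: build a cheap model once, from the
            # LLM's own outputs, and use it for all future calls.
            self.specialists[sig] = self.train_specialist(self.examples[sig])
        return answer
```

In practice, the interesting work hides in `signature` (recognizing that two calls are the same task) and `train_specialist` (model search plus fine-tuning), which is exactly where the abstract places the main challenges.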
HM3: Heterogeneous Multi-Class Model Merging
Foundation language model deployments often include auxiliary guard-rail models that filter or classify text, detecting jailbreak attempts or biased and toxic output, and ensuring topic adherence. These additional models increase the complexity and cost of inference, especially since many are themselves large language models. To address this issue, we explore training-free model merging techniques to consolidate these models into a single, multi-functional model. We propose Heterogeneous Multi-Class Model Merging (HM3) as a simple technique for merging multi-class classifiers with heterogeneous label spaces. Unlike parameter-efficient fine-tuning techniques like LoRA, which require extensive training and add complexity during inference, recent advancements allow models to be merged in a training-free manner. We report promising results for merging BERT-based guard models, some of which attain an average F1-score higher than the source models while reducing inference time by up to 44%. We introduce self-merging to assess the impact of reduced task-vector density, finding that the weaker-performing hate speech classifier benefits from self-merging while higher-performing classifiers do not, which raises questions about using task-vector reduction for model tuning.
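The training-free merging the abstract refers to builds on task arithmetic: subtract the base model's weights from each fine-tuned model to get a task vector, average the vectors, and add the result back to the base. The sketch below shows only that core step on plain weight dictionaries; HM3's handling of heterogeneous classifier heads (one head per source label space) is omitted, and the function name and `scale` parameter are illustrative.

```python
import numpy as np

def merge_task_vectors(base, finetuned_models, scale=1.0):
    """Training-free merge in the spirit of task arithmetic.

    tau_i = theta_i - theta_base
    theta_merged = theta_base + scale * mean_i(tau_i)

    `base` and each entry of `finetuned_models` map layer name -> array.
    Simplified sketch: real merges operate per-tensor on transformer
    checkpoints and keep the source-specific classification heads.
    """
    merged = {}
    for name, w_base in base.items():
        taus = [m[name] - w_base for m in finetuned_models]
        merged[name] = w_base + scale * np.mean(taus, axis=0)
    return merged
```

Because no gradients are involved, the merge costs one pass over the weights, which is where the inference-time and deployment-complexity savings come from: one merged network replaces several guard models.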
OutRank: Speeding up AutoML-based Model Search for Large Sparse Data sets with Cardinality-aware Feature Ranking
The design of modern recommender systems relies on understanding which parts of the feature space are relevant for solving a given recommendation task. However, real-world data sets in this domain are often characterized by their large size, sparsity, and noise, making it challenging to identify meaningful signals. Feature ranking represents an efficient branch of algorithms that can help address these challenges by identifying the most informative features and facilitating the automated search for more compact and better-performing models (AutoML). We introduce OutRank, a system for versatile feature ranking and data quality-related anomaly detection. OutRank was built with categorical data in mind, utilizing a variant of mutual information that is normalized with regard to the noise produced by features of the same cardinality. We further extend the measure by incorporating information on feature similarity and combined relevance. The proposed approach's feasibility is demonstrated by speeding up a state-of-the-art AutoML system on a synthetic data set with no performance loss. On a real-life click-through-rate prediction data set, OutRank outperformed strong baselines such as random forest-based approaches. The proposed approach enables exploration of up to 300% larger feature spaces compared to AutoML-only approaches, enabling faster search for better models on off-the-shelf hardware.
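The cardinality-aware normalization mentioned above can be illustrated as follows: high-cardinality categorical features inflate raw mutual information, so subtracting the score a shuffled copy of the feature (same cardinality, same marginals) would achieve gives a rough noise floor. This is a simplified sketch in the spirit of OutRank's idea, not its exact estimator; the function names and the permutation count are assumptions.

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two categorical arrays."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (x_idx, y_idx), 1)       # contingency counts
    joint /= joint.sum()                      # joint distribution
    px = joint.sum(axis=1, keepdims=True)     # marginals
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def cardinality_adjusted_mi(x, y, n_perm=30, seed=0):
    """MI of feature x with target y, minus the average MI of random
    shuffles of x. The shuffles preserve cardinality and marginals, so
    their score estimates the bias a feature of this cardinality earns
    by chance (a rough stand-in for OutRank's normalization)."""
    rng = np.random.default_rng(seed)
    noise = np.mean([mutual_information(rng.permutation(x), y)
                     for _ in range(n_perm)])
    return mutual_information(x, y) - noise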
Automating Machine Learning Pipelines
Creating a Machine Learning model is a difficult task because we need to write a lot of code to try different models and find out the performing model for that particular problem. There are different libraries that can automate this process to find out the best performing Machine Learning model but they also require some coding. What if I tell you that we can run multiple AutoML algorithms to find out the best model architecture for classification problems in a single code cell? Model search helps in implementing AutoML for classification problems. It runs multiple ML algorithms and compares them with each other.
Pre and Post Counting for Scalable Statistical-Relational Model Discovery
Statistical-Relational Model Discovery aims to find statistically relevant patterns in relational data. For example, a relational dependency pattern may stipulate that a user's gender is associated with the gender of their friends. As with propositional (non-relational) graphical models, the major scalability bottleneck for model discovery is computing instantiation counts: the number of times a relational pattern is instantiated in a database. Previous work on propositional learning utilized pre-counting or post-counting to solve this task. This paper takes a detailed look at the memory and speed trade-offs between pre-counting and post-counting strategies for relational learning. A pre-counting approach computes and caches instantiation counts for a large set of relational patterns before model search. A post-counting approach computes an instantiation count dynamically on-demand for each candidate pattern generated during the model search. We describe a novel hybrid approach, tailored to relational data, that achieves a sweet spot with pre-counting for patterns involving positive relationships (e.g. pairs of users who are friends) and post-counting for patterns involving negative relationships (e.g. pairs of users who are not friends). Our hybrid approach scales model discovery to millions of data facts.
- North America > Canada (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
AutoRec: An Automated Recommender System
Wang, Ting-Hsiang, Song, Qingquan, Han, Xiaotian, Liu, Zirui, Jin, Haifeng, Hu, Xia
For example, NCF [8] takes user-item implicit feedback data as inputs for the rating prediction task; and DeepFM [6] leverages both numerical and categorical data for the CTR prediction task. However, high degree of specialization comes at the expense of model adaptability and tuning complexity. As recommendation tasks evolve over time and additional types of data are collected, the originally apt model can either become obsolete or require tremendous tuning efforts. So far, several pipelines for recommender systems, e.g., OpenRec [16] and SMORe [4], tried to address the adaptability issue via providing modular base blocks that can be selected according to the context of recommendation. Nevertheless, both determining the blocks to use and tuning the model parameters are not straightforward when facing new data and changing tasks. In order to bridge the gap, we present AutoRec, which aims to provide an end-to-end solution to automate model selection and hyperparameter tuning. While many AutoML libraries, such as Auto-Sklearn [5] and TPOT [12] have shown promising results in general-purpose machine learning tasks (e.g., regression and hyperparameter tuning) and
- North America > United States > Texas (0.07)
- North America > United States > Florida > Hillsborough County > University (0.07)
- North America > United States > New York > New York County > New York City (0.04)
AutoML on Databricks: Augmenting Data Science from Data Prep to Operationalization - The Databricks Blog
Thousands of data science jobs are going unfilled today as global demand for the talent greatly outstrips supply. Every day, businesses pay the price of the data scientist shortage in missed opportunities and slow innovation. For organizations to realize the full potential of machine learning, data teams have to build hundreds of predictive models a year. For most enterprises, only a fraction of that number is actually achieved due to understaffed data science teams. Databricks can help data science teams be more productive by automating various steps of the data science workflow – including feature engineering, hyperparameter tuning, model search, and deployment – for a fully controlled and transparent augmented ML experience.